Using Blob Detection in Missing Feature Linear-Frequency Cepstral Coefficients for Robust Sound Event Recognition
Abstract
The proposed Missing Feature Linear-Frequency Cepstral Coefficients (MF-LFCC) feature is a noise-robust cepstral feature that transforms clean and noisy signals into similar representations. Unlike conventional missing feature techniques, the MF-LFCC requires neither substitution of spectrogram elements (imputation) nor modification of the classifier (marginalization). To improve the noise mask used in the MF-LFCC, we propose using blob detection, a computer vision technique, to identify the peaks that characterize the sparsity of sound event spectrograms. For single sound event recognition using SVM classifiers, the MF-LFCC is shown to significantly outperform the MFCC baseline and the noise-robust ETSI Advanced Front End feature in noisy conditions.
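The idea of using blob detection to find spectrogram peaks can be illustrated with a minimal sketch. It assumes a difference-of-Gaussians (DoG) blob detector, which is one common choice and not necessarily the detector or mask estimation used by the authors. The function names (`gaussian_kernel`, `blur`, `blob_mask`) and the parameters `sigma`, `ratio`, and `rel_thresh` are hypothetical, chosen only for illustration.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at about 3 sigma, normalized to sum to 1."""
    radius = int(3 * sigma) + 1
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur with zero padding (rows first, then columns).

    Assumes each image dimension is at least as long as the kernel.
    """
    k = gaussian_kernel(sigma)
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)
    return out

def blob_mask(spec, sigma=1.0, ratio=1.6, rel_thresh=0.5):
    """Binary mask marking blob-like peaks in a 2-D spectrogram.

    A difference of Gaussians responds strongly to compact bright regions.
    Bins whose response exceeds rel_thresh times the maximum response are
    treated as reliable (assumption: a simple relative threshold).
    """
    dog = blur(spec, sigma) - blur(spec, sigma * ratio)
    return dog > rel_thresh * dog.max()

# Toy "spectrogram": a low noise floor with one isolated peak.
rng = np.random.default_rng(0)
spec = rng.uniform(0.0, 0.01, size=(32, 32))
spec[10, 20] = 1.0
mask = blob_mask(spec)
```

In this toy example the mask keeps the bins around the isolated peak and rejects the noise floor. A real system would apply such a mask to a log-power spectrogram before computing cepstral coefficients, with thresholds tuned to the data.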
Similar resources
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients (MFCCs) are the most widely used feature in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
Using Mel-Frequency Cepstral Coefficients in Missing Data Technique
Filter bank features are the most commonly employed in research on marginalisation approaches to robust speech recognition, because they make it simple to detect unreliable data in the frequency domain. In this paper, we propose a hybrid approach based on the marginalisation and the soft decision techniques that make use of the Mel-frequency cepstral coefficients (MFCCs) instead of f...
Selective gammatone filterbank feature for robust sound event recognition
This paper introduces a novel feature based on the raw output of the gammatone filterbank. Channel selection is used to enhance robustness over a range of signal-to-noise ratios (SNR) of additive noise. The recognition accuracy of the proposed feature is tested on a sound event database using a Hidden Markov Model (HMM) recogniser. A comparison with a series of similar features and the conventi...
DWT and LPC based feature extraction methods for isolated word recognition
In this article, new feature extraction methods, which utilize wavelet decomposition and reduced order linear predictive coding (LPC) coefficients, have been proposed for speech recognition. The coefficients have been derived from the speech frames decomposed using discrete wavelet transform. LPC coefficients derived from subband decomposition (abbreviated as WLPC) of speech frame provide bette...
Physiologically Motivated Feature Extraction for Robust Automatic Speech Recognition
In this paper, a new method is presented to extract robust speech features in the presence of external noise. The proposed method, based on two-dimensional Gabor filters, takes into account the spectro-temporal modulation frequencies and also limits redundancy at the feature level. The performance of the proposed feature extraction method was evaluated on isolated speech words which are ext...